2023-11-13 08:59:01.AIbase
ZhiYuan Research Institute Releases Open-Source JudgeLM Evaluation Model to Assess Various Large Models and Provide Scores
ZhiYuan Research Institute has open-sourced JudgeLM, an evaluation model that efficiently assesses large models and assigns them scores. Compared with GPT-4, JudgeLM costs only 1/120 as much while keeping the consistency of its evaluation results above 90%. It can be applied to a range of assessment scenarios, including pure-text and multimodal ones, and it outputs both a score and the reasoning behind that score. Its judgments agree with reference answers more than 90% of the time, which approaches human performance. ZhiYuan Research Institute has also released the training and validation datasets to support in-depth research on large models.
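The article does not say how the consistency rate is computed. As a rough illustration only, the sketch below assumes it is the share of answer pairs on which the judge model and the reference pick the same winner. The function names (`verdict`, `agreement_rate`) and the score pairs are hypothetical toy values, not JudgeLM's code or data.

```python
from typing import List, Tuple

ScorePair = Tuple[float, float]  # (score for answer A, score for answer B)


def verdict(score_a: float, score_b: float) -> str:
    """Reduce a pair of scores to a pairwise verdict: 'A', 'B', or 'tie'."""
    if score_a > score_b:
        return "A"
    if score_b > score_a:
        return "B"
    return "tie"


def agreement_rate(judge: List[ScorePair], reference: List[ScorePair]) -> float:
    """Fraction of items where judge and reference reach the same verdict.

    Assumption: 'consistency' means verdict agreement, not score equality.
    """
    if len(judge) != len(reference):
        raise ValueError("judge and reference must have the same length")
    if not judge:
        raise ValueError("at least one scored pair is required")
    matches = sum(
        verdict(*j) == verdict(*r) for j, r in zip(judge, reference)
    )
    return matches / len(judge)


# Toy scores invented for illustration; they are not JudgeLM outputs.
judge_scores = [(8, 6), (5, 7), (9, 9), (4, 2), (7, 3)]
reference_scores = [(9, 5), (4, 8), (8, 9), (6, 1), (8, 2)]

rate = agreement_rate(judge_scores, reference_scores)
print(f"agreement: {rate:.0%}")  # prints "agreement: 80%"
```

Here the two sets of scores disagree only on the third pair (a tie versus a win for answer B), so four of five verdicts match. Comparing verdicts rather than raw scores means two judges can agree even when their absolute scores differ.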